Asynchronous Large-Scale Graph Processing Made Easy

Authors

  • Guozhang Wang
  • Wenlei Xie
  • Alan J. Demers
  • Johannes Gehrke
Abstract

Scaling large iterative graph processing applications through parallel computing is a very important problem. Several graph processing frameworks have been proposed that insulate developers from low-level details of parallel programming. Most of these frameworks are based on the bulk synchronous parallel (BSP) model in order to simplify application development. However, in the BSP model, vertices are processed in fixed rounds, which often leads to slow convergence. Asynchronous executions can significantly accelerate convergence by intelligently ordering vertex updates and incorporating the most recent updates. Unfortunately, asynchronous models do not provide the programming simplicity and scalability advantages of the BSP model. In this paper, we combine the easy programmability of the BSP model with the high performance of asynchronous execution. We have designed GRACE, a new graph programming platform that separates application logic from execution policies. GRACE provides a synchronous iterative graph programming model for users to easily implement, test, and debug their applications. It also contains a carefully designed and implemented parallel execution engine for both synchronous and user-specified built-in asynchronous execution policies. Our experiments show that asynchronous execution in GRACE can yield convergence rates comparable to fully asynchronous executions, while still achieving the near-linear scalability of a synchronous BSP system.
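The contrast the abstract draws between BSP rounds and asynchronous execution can be illustrated with a minimal sketch. This is not GRACE's API: the toy chain graph, the shortest-path update rule, and the names `vertex_update`, `run_synchronous`, and `run_asynchronous` are all assumptions made for illustration. The point is only that one piece of vertex logic can be run under two different execution policies.

```python
import math

# Hypothetical toy graph (an assumption for this sketch): a directed chain
# 0 -> 1 -> 2 -> 3 -> 4, stored as incoming edges (neighbor, weight).
IN_EDGES = {
    0: [],
    1: [(0, 1.0)],
    2: [(1, 1.0)],
    3: [(2, 1.0)],
    4: [(3, 1.0)],
}
SOURCE = 0


def vertex_update(v, dist):
    """Application logic: shortest-path relaxation for a single vertex."""
    if v == SOURCE:
        return 0.0
    return min((dist[u] + w for u, w in IN_EDGES[v]), default=math.inf)


def run_synchronous(update):
    """BSP-style policy: every vertex reads the previous round's snapshot."""
    dist = {v: math.inf for v in IN_EDGES}
    rounds = 0
    while True:
        rounds += 1
        new_dist = {v: update(v, dist) for v in IN_EDGES}
        if new_dist == dist:  # nothing changed: converged
            return dist, rounds
        dist = new_dist


def run_asynchronous(update, order):
    """Asynchronous-style policy: values are updated in place, so vertices
    later in the sweep already see the most recent updates."""
    dist = {v: math.inf for v in IN_EDGES}
    sweeps = 0
    while True:
        sweeps += 1
        changed = False
        for v in order:
            new_value = update(v, dist)
            if new_value != dist[v]:
                dist[v] = new_value
                changed = True
        if not changed:  # a full sweep with no change: converged
            return dist, sweeps


sync_dist, sync_rounds = run_synchronous(vertex_update)
async_dist, async_sweeps = run_asynchronous(vertex_update, [0, 1, 2, 3, 4])
print(sync_rounds, async_sweeps)
```

Both policies reach the same distances, but on this chain the in-place policy with the ordering 0 to 4 needs fewer passes than the round-based one. Sweeping in the reverse order removes that advantage, which is consistent with the abstract's emphasis on ordering vertex updates intelligently.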

Related resources

Giraph Unchained: Barrierless Asynchronous Parallel Execution in Pregel-like Graph Processing Systems

The bulk synchronous parallel (BSP) model used by synchronous graph processing systems allows algorithms to be easily implemented and reasoned about. However, BSP can suffer from poor performance due to stale messages and frequent global synchronization barriers. Asynchronous computation models have been proposed to alleviate these overheads but existing asynchronous systems that implement such...

An asynchronous traversal engine for graph-based rich metadata management

Rich metadata in high-performance computing (HPC) systems contains extended information about users, jobs, data files, and their relationships. Property graphs are a promising data model to represent heterogeneous rich metadata flexibly. Specifically, a property graph can use vertices to represent different entities and edges to record the relationships between vertices with unique annotations....

Performance Analysis of Mixed Asynchronous Synchronous Systems (IEEE Workshop on VLSI Signal Processing)

The paper is concerned with the timing analysis of a class of digital systems we call mixed asynchronous-synchronous systems. In such a system, each computation module is either synchronous (i.e., clocked) or asynchronous (i.e., self-timed). The communication between modules is assumed to be self-timed for all modules. We introduce a graph model called MASS for describing the timing behaviour of such ...

An implementation for complete asynchronous

We expand an acyclic distributed garbage collector (the cleanup protocol of Stub-Scion Pair Chains) with a detector of distributed cycles of garbage. The result is a complete and asynchronous distributed garbage collector. The detection algorithm for free distributed cycles is inspired by Hughes [5]. A local collector marks outgoing references with dates, which are propagated asynchrono...

Asynchronous Logging and Fast Recovery for a Large-Scale Distributed In-Memory Storage

Large-scale interactive applications and online graph analytic processing require very fast data access to many small data objects. DXRAM addresses these challenges by always keeping all data in the memory of potentially many nodes aggregated in a data center. Data loss in case of node failures is prevented by asynchronous logging to flash disks. In this paper we present the architecture of a no...

Publication date: 2013